For folks used to modeling data in an RDBMS world, not having the same
tools (foreign keys, joins) in the Azure table world can be a bit of a
culture shock. One area where a lot of people have trouble is basic data
modeling.
Note: Steve Marx from the Windows Azure team suggested that some of the
information in this section be included in this book, and his code was
the source of ideas for some of the sample code shown here.
1. One-to-Many
When you model data, you often have a parent-child, or one-to-many,
relationship. A canonical example is the customer-order data model
shown in Figure 1: a customer “has” many orders, and you often need to
look up all orders belonging to a single customer.
Let’s turn the diagram shown in Figure 1 into a model in Azure tables. The
following code shows a simple Customer data model. There’s nothing fancy about
it. It just represents some sample properties on the Customer and picks an
arbitrary partitioning scheme:
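class Customer : TableServiceEntity
{
    // PartitionKey is the company name; RowKey is the customer ID.
    public Customer(string name, string id, string company,
        string address) : base(company, id)
    {
        this.Name = name;
        this.ID = id;
        this.Company = company;
        this.Address = address;
        // base(company, id) already sets the keys; they are repeated
        // here to make the partitioning scheme explicit.
        this.PartitionKey = this.Company;
        this.RowKey = this.ID;
    }

    // Parameterless constructor, needed when entities are
    // materialized from query results.
    public Customer() { }

    public string Name { get; set; }
    public string Company { get; set; }
    public string Address { get; set; }
    public string ID { get; set; }
}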
Similarly, let’s define an Order entity. You would want to look up a given
customer’s orders quickly, so a customer ID makes a natural choice of
partition key, since you can always specify that in any queries you
make.
The following code shows an Order entity that takes the customer ID it
“belongs” to, as well as some other properties. For those of you who are
used to specifying foreign keys, note how the “foreign key” relationship
between customer and order is implicit in the fact that CustomerID is specified in the creation of
every order. However, as mentioned previously, there is no referential
integrity checking across tables. You could happily delete a customer,
and the table service won’t warn you about the orphaned orders left behind:
class Order : TableServiceEntity
{
    // PartitionKey is the owning customer's ID; RowKey is the order ID.
    public Order(string customerID, string orderID, string orderDetails)
        : base(customerID, orderID)
    {
        this.CustomerID = customerID;
        this.OrderID = orderID;
        this.OrderDetails = orderDetails;
        this.PartitionKey = CustomerID;
        this.RowKey = OrderID;
    }

    // Parameterless constructor, needed when entities are
    // materialized from query results.
    public Order() { }

    public string CustomerID { get; set; }
    public string OrderID { get; set; }
    public string OrderDetails { get; set; }
}
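Because the table service does not enforce the relationship, an
application that deletes customers may want its own check for orders
left behind. The following is a minimal sketch of such a check. It uses
in-memory collections as stand-ins for the Customer and Order tables, and
the OrderRow and OrphanCheck names are hypothetical, not part of any
Azure library:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a row in the Order table.
class OrderRow
{
    public string PartitionKey { get; set; }  // customer ID
    public string RowKey { get; set; }        // order ID
}

static class OrphanCheck
{
    // Returns the orders whose partition key (customer ID) does not
    // match any of the customer IDs that still exist.
    public static List<OrderRow> FindOrphans(
        IEnumerable<string> customerIds, IEnumerable<OrderRow> orders)
    {
        var known = new HashSet<string>(customerIds);
        return orders.Where(o => !known.Contains(o.PartitionKey)).ToList();
    }
}
```

Against the real service, the same idea would mean deleting a customer's
orders (a single-partition query on the customer ID) as part of deleting
the customer.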
The final piece of the puzzle is to get all orders pertaining to a
given customer. There are a few ways you can do this.
The first is to store all the OrderIDs for a given Customer as a
serialized list in a property of the Customer object. This has the
advantage of not requiring multiple queries: when you get back the
Customer object, you already have the list of orders as well. However,
this is suboptimal for huge numbers of orders, because you can store only
a limited number of such IDs before you run into the size limits on
entities (64 KB for a single property, and 1 MB for an entire entity).
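To make the trade-off concrete, here is a minimal sketch of the
serialized-list approach. It assumes the order IDs are packed into one
comma-separated string property and that the IDs themselves contain no
commas. The OrderIdList helper and its method names are hypothetical, and
the size check assumes the table service's 64 KB cap on a single string
property, with strings stored as UTF-16 (two bytes per character):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: packs order IDs into one string property value.
static class OrderIdList
{
    // Assumed limit: one string property holds at most 64 KB of UTF-16 text.
    public const int MaxPropertyBytes = 64 * 1024;

    // Joins the IDs into a single comma-separated string.
    public static string Serialize(IEnumerable<string> orderIds)
    {
        return string.Join(",", orderIds.ToArray());
    }

    // Splits a stored value back into the individual IDs.
    public static List<string> Deserialize(string serialized)
    {
        if (string.IsNullOrEmpty(serialized)) return new List<string>();
        return serialized.Split(',').ToList();
    }

    // True if the serialized value fits within a single property.
    public static bool FitsInProperty(string serialized)
    {
        return serialized.Length * sizeof(char) <= MaxPropertyBytes;
    }
}
```

Under these assumptions, 36-character GUID order IDs leave room for fewer
than 900 orders per customer, which is why this approach suits only small
lists.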
A better model is to add a helper method to the Customer entity class to
look up all Order entities associated with it. This has the overhead of
an additional query, but it scales to any number of orders. The following
code shows the modified Customer class. The code assumes a data service
context class (not shown) that has properties corresponding to the
Customer and Order tables:
class Customer : TableServiceEntity
{
    public Customer(string name, string id, string company,
        string address) : base(company, id)
    {
        this.Name = name;
        this.ID = id;
        this.Company = company;
        this.Address = address;
        this.PartitionKey = this.Company;
        this.RowKey = this.ID;
    }

    public Customer() { }

    public string Name { get; set; }
    public string Company { get; set; }
    public string Address { get; set; }
    public string ID { get; set; }

    // Orders are partitioned by customer ID, so this query
    // targets a single partition of the Order table.
    public IEnumerable<Order> GetOrders()
    {
        return from o in new CustomerOrderDataServiceContext().OrderTable
               where o.PartitionKey == this.ID
               select o;
    }
}
2. Many-to-Many
Another common scenario in modeling data is a many-to-many
relationship. This is best explained with the help of a sample model,
such as the one shown in Figure 2.
This model could form the basis of many social networking sites. It
shows two entities, Friend and
Group, with a many-to-many
relationship with each other. There can be many friends in a single
group (example groups being “School,” “College,” “Work,” and
“Ex-boyfriends”), and a friend can be in many groups (“Work” and
“Ex-boyfriends”).
The application may want to traverse this relationship in either
direction. Given a friend, it might want to display all groups she
belongs to. Similarly, given a group, it might want to list all the
friends you have in it.
Let’s start by creating Friend and Group entity classes. Both define a
few properties and use a simple partitioning scheme, which isn’t
important for the discussion here:
class Friend : TableServiceEntity
{
    public string Name { get; set; }
    public string FriendID { get; set; }
    public string Details { get; set; }

    public Friend(string id, string name, string details) : base(name, id)
    {
        this.Name = name;
        this.FriendID = id;
        this.Details = details;
        this.PartitionKey = Name;
        this.RowKey = FriendID;
    }

    public Friend() { }
}
class Group : TableServiceEntity
{
    public string Name { get; set; }
    public string GroupID { get; set; }

    // The group ID serves as both PartitionKey and RowKey.
    public Group(string name, string id)
        : base(id, id)
    {
        this.Name = name;
        this.GroupID = id;
        this.PartitionKey = id;
        this.RowKey = id;
    }

    public Group() { }
}
How do you now represent the relationship between Friend and Group? The
best way to deal with this is to create a separate “join” table that
contains one entity per friend-group relation. To look up all friends in
a group, you just need to query this table with that specific GroupID
(and vice versa, with a FriendID, for all groups a friend belongs to).
Following is the code for this simple table:
class FriendGroupRelationship : TableServiceEntity
{
    public string FriendID { get; set; }
    public string GroupID { get; set; }

    public FriendGroupRelationship(string friendID, string groupID)
        : base(friendID, groupID)
    {
        this.FriendID = friendID;
        this.GroupID = groupID;
        this.PartitionKey = FriendID;
        this.RowKey = GroupID;
    }

    public FriendGroupRelationship() { }
}
In this code, you chose to partition based on FriendID. This means that querying all groups
to which a friend belongs will be fast, while the reverse won’t be. If
your application cares more about displaying all friends in a group
quickly, you can pick the reverse partitioning scheme, or create two
tables with two different partitioning schemes for the two
scenarios.
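The two-table option can be sketched as follows. This is an in-memory
model only: each dictionary stands in for one join table, keyed by its
partition key, and the FriendGroupIndex class and its members are
hypothetical names chosen for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory model of two join tables that hold the same
// relations under opposite partitioning schemes.
class FriendGroupIndex
{
    // Stand-in for a table partitioned by FriendID (row key: GroupID).
    private readonly Dictionary<string, HashSet<string>> groupsByFriend =
        new Dictionary<string, HashSet<string>>();

    // Stand-in for a table partitioned by GroupID (row key: FriendID).
    private readonly Dictionary<string, HashSet<string>> friendsByGroup =
        new Dictionary<string, HashSet<string>>();

    // Every relation is written twice, once per table.
    public void Add(string friendId, string groupId)
    {
        Insert(groupsByFriend, friendId, groupId);
        Insert(friendsByGroup, groupId, friendId);
    }

    // Each lookup reads a single partition of the matching table.
    public IEnumerable<string> GroupsOf(string friendId)
    {
        return Lookup(groupsByFriend, friendId);
    }

    public IEnumerable<string> FriendsIn(string groupId)
    {
        return Lookup(friendsByGroup, groupId);
    }

    private static void Insert(Dictionary<string, HashSet<string>> table,
        string partitionKey, string rowKey)
    {
        HashSet<string> rows;
        if (!table.TryGetValue(partitionKey, out rows))
        {
            rows = new HashSet<string>();
            table[partitionKey] = rows;
        }
        rows.Add(rowKey);
    }

    private static IEnumerable<string> Lookup(
        Dictionary<string, HashSet<string>> table, string partitionKey)
    {
        HashSet<string> rows;
        return table.TryGetValue(partitionKey, out rows)
            ? rows
            : Enumerable.Empty<string>();
    }
}
```

The price of fast lookups in both directions is that every change is
written twice. Against the real service the two writes go to different
tables and are not atomic, so the application has to cope with one
succeeding and the other failing.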
Note that whenever you add a friend to a group, you must add an entity to
the join table, and you must remove the corresponding entities when you
delete the friend or the group. Following is a code snippet that shows
how adding a friend to a group might look:
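// Generate a unique ID for the new friend
var id = Guid.NewGuid().ToString();
var friend = new Friend(id,
    "Jean Luc Picard", "Captain, U.S.S. Enterprise");

// Add Picard to a group
var friendgrouprelation = new
    FriendGroupRelationship(id, "captains");

// context is assumed to be a TableServiceContext for these tables
context.AddObject("Friend", friend);
context.AddObject("FriendGroupRelationship",
    friendgrouprelation);
// Send the pending inserts to the table service
context.SaveChanges();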
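The cleanup half of that rule, removing join-table entities when a friend
or a group is deleted, might look like the following sketch. It models
the join table as an in-memory list partitioned by FriendID, and the
RelationRow and RelationCleanup names are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a row in the join table.
class RelationRow
{
    public string PartitionKey { get; set; }  // FriendID
    public string RowKey { get; set; }        // GroupID
}

static class RelationCleanup
{
    // Deleting a friend: all matching rows share one partition key.
    // Returns the number of relations removed.
    public static int RemoveFriend(List<RelationRow> joinTable, string friendId)
    {
        return joinTable.RemoveAll(r => r.PartitionKey == friendId);
    }

    // Deleting a group: matching rows are spread across partitions, so
    // the equivalent query on the real service would span all of them.
    public static int RemoveGroup(List<RelationRow> joinTable, string groupId)
    {
        return joinTable.RemoveAll(r => r.RowKey == groupId);
    }
}
```

With the join table partitioned by FriendID, cleaning up after a deleted
friend is the cheap direction, and cleaning up after a deleted group is
the expensive one.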